refactor: establish testability as core development practice #19
Merged
Create language mapping utilities for file extension handling:
- getExtensionForLanguage(): Map language names to file extensions
- getSupportedLanguages(): Get list of supported languages
- isLanguageSupported(): Check language support
- getLanguageFromExtension(): Reverse lookup extension → language

These utilities provide the foundation for language-specific indexing and file filtering in the repository indexer.

Added 31 comprehensive tests including:
- All supported languages (TypeScript, JavaScript, Python, Go, Rust, Markdown)
- Case-insensitive handling
- Unknown language fallback
- Bidirectional mapping (round-trip)
- Integration scenarios

100% coverage on all language utilities.
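A minimal sketch of what these four utilities might look like. The function names come from the commit message; the extension table, the signatures, and the fallback for unknown languages (a dot plus the lowercased name) are assumptions made for illustration, not the PR's actual code.

```typescript
// Hypothetical language table: the commit lists the languages,
// but the exact extensions used in the PR are assumed here.
const LANGUAGE_TABLE: ReadonlyArray<readonly [string, string]> = [
  ["typescript", ".ts"],
  ["javascript", ".js"],
  ["python", ".py"],
  ["go", ".go"],
  ["rust", ".rs"],
  ["markdown", ".md"],
];

// Map a language name to its file extension (case-insensitive).
// Assumed fallback: unknown languages map to "." + lowercased name.
function getExtensionForLanguage(language: string): string {
  const key = language.toLowerCase();
  const entry = LANGUAGE_TABLE.find(([name]) => name === key);
  return entry ? entry[1] : `.${key}`;
}

// List every language in the table.
function getSupportedLanguages(): string[] {
  return LANGUAGE_TABLE.map(([name]) => name);
}

// Check whether a language is in the table (case-insensitive).
function isLanguageSupported(language: string): boolean {
  const key = language.toLowerCase();
  return LANGUAGE_TABLE.some(([name]) => name === key);
}

// Reverse lookup: accepts "ts" or ".ts" in any case.
// Returns undefined when the extension is not in the table.
function getLanguageFromExtension(extension: string): string | undefined {
  const lower = extension.toLowerCase();
  const withDot = lower.startsWith(".") ? lower : `.${lower}`;
  const entry = LANGUAGE_TABLE.find(([, ext]) => ext === withDot);
  return entry ? entry[0] : undefined;
}
```

Because each function is pure and reads only from the table, the round-trip property mentioned in the test list (language → extension → language) can be checked directly without constructing an indexer.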
Create document formatting utilities for enhanced embedding quality:
- formatDocumentText(): Combine type, name, and content for better semantic search
- formatDocumentTextWithSignature(): Include function signatures for enhanced searchability
- truncateText(): Limit text length while preserving important content
- cleanDocumentText(): Normalize whitespace and reduce token count

These utilities are independent and provide flexible text formatting options for optimizing embedding generation and search relevance.

Added 27 comprehensive tests including:
- Basic formatting with name/text/signature combinations
- Empty and edge case handling
- Multiline text preservation
- Truncation with ellipsis (various lengths)
- Whitespace normalization (spaces, newlines, tabs)
- Integration scenarios (format → clean → truncate pipeline)

100% coverage on all formatting utilities.
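The format → clean → truncate pipeline could be sketched roughly as follows. Only the function names are taken from the commit message; the parameter lists, the "type name, newline, content" layout, and the three-dot ellipsis are assumptions for illustration.

```typescript
// Combine type, name, and content into one string for embedding.
// Assumed layout: "<type> <name>" on the first line, content after it.
function formatDocumentText(type: string, name: string, text: string): string {
  const header = [type, name].filter((part) => part.trim().length > 0).join(" ");
  return [header, text].filter((part) => part.length > 0).join("\n");
}

// Collapse every run of whitespace (spaces, tabs, newlines) into a
// single space and trim both ends.
function cleanDocumentText(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

// Limit text to maxLength characters, ending with "..." when cut.
// The result never exceeds maxLength, even for very small limits.
function truncateText(text: string, maxLength: number): string {
  const ellipsis = "...";
  if (text.length <= maxLength) return text;
  if (maxLength <= ellipsis.length) return ellipsis.slice(0, Math.max(0, maxLength));
  return text.slice(0, maxLength - ellipsis.length) + ellipsis;
}

// Pipeline usage: format, then clean, then truncate.
const formatted = formatDocumentText("function", "add", "return a +\n\t b;");
const pipelineResult = truncateText(cleanDocumentText(formatted), 20);
```

Keeping the three steps as separate functions means a caller can skip cleaning (to preserve multiline text) or skip truncation without touching the formatter.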
Create document transformation utilities for embedding pipeline:
- prepareDocumentsForEmbedding(): Transform Document[] → EmbeddingDocument[]
- prepareDocumentForEmbedding(): Single document variant for incremental indexing
- filterDocumentsByExport(): Filter by public/private API
- filterDocumentsByType(): Filter by document type (function, class, etc.)
- filterDocumentsByLanguage(): Case-insensitive language filtering

These utilities depend on formatting utilities (formatDocumentText) and provide the critical transformation layer between repository scanning and vector storage.

Added 29 comprehensive tests including:
- Document transformation with full metadata
- Single and batch operations
- Export status filtering (public API extraction)
- Type-based filtering (functions, classes, etc.)
- Case-insensitive language filtering
- Integration scenarios (chained filters + preparation)

100% coverage on all document utilities.
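A rough sketch of how the filters and the preparation step might chain together. The field names on both types are hypothetical, and `SourceDocument` stands in for the PR's `Document` type (renamed here only to avoid clashing with the DOM's global `Document`). The text formatting is inlined as a stand-in for formatDocumentText.

```typescript
// Hypothetical shapes: the PR does not show the real type definitions.
interface SourceDocument {
  id: string;
  type: string;
  name: string;
  text: string;
  language: string;
  exported: boolean;
}

interface EmbeddingDocument {
  id: string;
  text: string;
  metadata: { type: string; name: string; language: string; exported: boolean };
}

// Single-document variant, usable for incremental indexing.
function prepareDocumentForEmbedding(doc: SourceDocument): EmbeddingDocument {
  return {
    id: doc.id,
    // Stand-in for formatDocumentText(); the real layout is an assumption.
    text: `${doc.type} ${doc.name}\n${doc.text}`,
    metadata: { type: doc.type, name: doc.name, language: doc.language, exported: doc.exported },
  };
}

// Batch variant built on the single-document function.
function prepareDocumentsForEmbedding(docs: SourceDocument[]): EmbeddingDocument[] {
  return docs.map((doc) => prepareDocumentForEmbedding(doc));
}

// Keep only exported (public API) or only non-exported documents.
function filterDocumentsByExport(docs: SourceDocument[], exported: boolean = true): SourceDocument[] {
  return docs.filter((doc) => doc.exported === exported);
}

// Keep documents whose type is one of the given types.
function filterDocumentsByType(docs: SourceDocument[], types: string[]): SourceDocument[] {
  return docs.filter((doc) => types.includes(doc.type));
}

// Keep documents written in the given language (case-insensitive).
function filterDocumentsByLanguage(docs: SourceDocument[], language: string): SourceDocument[] {
  const wanted = language.toLowerCase();
  return docs.filter((doc) => doc.language.toLowerCase() === wanted);
}

// Chained usage: public TypeScript functions, ready for embedding.
const sampleDocs: SourceDocument[] = [
  { id: "1", type: "function", name: "add", text: "return a + b;", language: "TypeScript", exported: true },
  { id: "2", type: "class", name: "Indexer", text: "class body", language: "typescript", exported: true },
  { id: "3", type: "function", name: "helper", text: "return 1;", language: "typescript", exported: false },
  { id: "4", type: "function", name: "main", text: "pass", language: "python", exported: true },
];
const publicTsFunctions = prepareDocumentsForEmbedding(
  filterDocumentsByType(filterDocumentsByLanguage(filterDocumentsByExport(sampleDocs), "typescript"), ["function"]),
);
```

Each filter returns a new array and leaves its input untouched, so the filters can be applied in any order before preparation.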
Wire up all utility modules and update RepositoryIndexer:

Integration changes:
- Create utils/index.ts barrel export for clean imports
- Update RepositoryIndexer to import from modular utils
- Remove old private methods (now extracted as utilities):
  * prepareDocumentsForEmbedding → utils/documents.ts
  * formatDocumentText → utils/formatting.ts
  * getExtensionForLanguage → utils/language.ts

Testing:
- All 126 tests passing (87 utils + 39 integration)
- Backward compatibility maintained
- No behavior changes, only refactoring

This completes the modular refactoring, providing:
- Clean separation of concerns (language → formatting → documents)
- Tree-shakeable exports for optimal bundling
- Self-contained, testable modules
- 100% coverage on all utility modules

Benefits:
- Improved testability (direct unit tests vs integration-only)
- Better code organization (SRP)
- Easier to understand and maintain
- Ready for reuse in other packages
Add comprehensive documentation and tooling to make testability the default way of working:

Documentation:
- TESTABILITY.md: Complete guide with principles, examples, checklists
- FEATURE_TEMPLATE.md: Step-by-step template for new features
- Updated CONTRIBUTING.md with testability section
- PR template with testability checklist

Key principles enshrined:
1. Extract pure functions to utils/ modules
2. 100% coverage on utilities
3. No non-null assertions (!)
4. Organize by domain, not "misc"
5. Atomic commits with clear dependencies

Coverage targets:
- Pure utilities: 100%
- Integration: >80%
- CLI/UI: >60%

Scripts:
- Added test:coverage command for easy coverage checks

Real examples referenced:
- Explorer subagent (99 tests, 100% utils coverage)
- Repository indexer (87 tests, 100% utils coverage)

This establishes testability as a first-class concern and provides clear guidance for all future development.
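As an illustration of principle 3, here is one way a non-null assertion can be replaced with an explicit check. The helper name `requireValue` is hypothetical and does not appear in the PR; it is only a sketch of the general pattern.

```typescript
// Discouraged under principle 3: the "!" silences the compiler but
// fails later, and less clearly, if the key is missing.
//   const ext = extensions.get(language)!;

// Hypothetical helper: look up a key and fail loudly with a clear
// message instead of asserting non-null.
// Note: a stored value of undefined is treated the same as a missing key.
function requireValue<K, V>(map: ReadonlyMap<K, V>, key: K): V {
  const value = map.get(key);
  if (value === undefined) {
    throw new Error(`Missing entry for key: ${String(key)}`);
  }
  return value;
}

// Usage: the return type is narrowed to string without any "!".
const extensions = new Map<string, string>([["typescript", ".ts"]]);
const tsExtension: string = requireValue(extensions, "typescript");
```

The explicit branch also gives tests something concrete to cover: both the found and the missing case can be exercised in a unit test.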
🎯 Overview
Establishes testability and modular design as core development practices through documentation, tooling, and exemplary implementations.
📚 Documentation Added
1. TESTABILITY.md - Comprehensive Guide
2. FEATURE_TEMPLATE.md - Step-by-Step Template
3. Updated CONTRIBUTING.md
4. PR Template
🏗️ Exemplary Implementations
Indexer Utils (This PR)
Refactored Repository Indexer with modular utilities:
Benefits:
Previously: Explorer Subagent (PR #18)
📊 Coverage Targets (Now Official)
🛠️ Tooling
test:coverage script for easy coverage checks
🎓 Enforcement Strategy
Soft Enforcement (Education)
Process Enforcement
Future: Hard Enforcement (Optional)
! assertions
🏆 Key Principles Established
🔗 Commits
This PR contains 5 granular commits:
1. feat(indexer): add language utilities (foundation) - No dependencies, 31 tests
2. feat(indexer): add formatting utilities (independent) - Independent, 27 tests
3. feat(indexer): add document preparation utilities - Depends on formatting, 29 tests
4. refactor(indexer): integrate modular utils architecture - Wire everything together
5. docs: enshrine testability as core development practice - Documentation & tooling

Each commit builds and tests independently.
✅ Testing
All 126 tests passing:
pnpm vitest run packages/core/src/indexer --coverage
# 100% statements, 88.88% branches, 100% functions on utils
📖 For Reviewers
This PR establishes patterns that will guide all future development:
🚀 Impact